Pushing the Boundaries of Crowd-enabled Databases with Query-driven Schema Expansion
Authors
Abstract
By incorporating human workers into the query execution process, crowd-enabled databases facilitate intelligent, social capabilities such as completing missing data at query time or performing cognitive operators. But despite all their flexibility, crowd-enabled databases still maintain rigid schemas. In this paper, we extend crowd-enabled databases with flexible query-driven schema expansion, allowing new attributes to be added to the database at query time. However, the number of crowdsourced mini-tasks needed to fill in missing values may often be prohibitively large, and the resulting data quality is doubtful. Instead of simply crowdsourcing all values individually, we leverage the user-generated data found in the Social Web: by exploiting user ratings, we build perceptual spaces, i.e., highly compressed representations of the opinions, impressions, and perceptions of large numbers of users. Using a few training samples obtained by expert crowdsourcing, we can then extract all missing data automatically from the perceptual space with high quality and at low cost. Extensive experiments show that our approach can boost both the performance and the quality of crowd-enabled databases, while also providing the flexibility to expand schemas in a query-driven fashion.
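The pipeline described in the abstract (build a perceptual space from user ratings, label a few items through the crowd, then fill the new attribute for all remaining items) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which factor model or classifier is used, so a truncated SVD stands in for the perceptual space and a nearest-centroid rule stands in for the extraction step. The function names `build_perceptual_space` and `expand_attribute` are hypothetical, and the rating matrix is invented toy data.

```python
import numpy as np


def build_perceptual_space(ratings, dims):
    """Embed items in a low-dimensional space.

    Assumption: a truncated SVD of the per-user mean-centered
    item-by-user rating matrix stands in for the factor model.
    """
    centered = ratings - ratings.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :dims] * s[:dims]  # one row of coordinates per item


def expand_attribute(space, labeled):
    """Fill a new Boolean attribute for every item.

    `labeled` maps a few item indices to crowd-provided values.
    Assumption: a nearest-centroid rule stands in for the classifier.
    """
    pos = np.mean([space[i] for i, v in labeled.items() if v], axis=0)
    neg = np.mean([space[i] for i, v in labeled.items() if not v], axis=0)
    d_pos = np.linalg.norm(space - pos, axis=1)
    d_neg = np.linalg.norm(space - neg, axis=1)
    return [bool(x) for x in d_pos < d_neg]


# Toy data: 8 items (rows) rated by 6 users (columns) on a 1-5 scale.
# Users 0-2 like items 0-3; users 3-5 like items 4-7.
ratings = np.array(
    [[5, 5, 5, 1, 1, 1]] * 4 + [[1, 1, 1, 5, 5, 5]] * 4, dtype=float
)

space = build_perceptual_space(ratings, dims=1)

# Two crowd-labeled samples for a hypothetical new attribute.
labeled = {0: True, 4: False}
filled = expand_attribute(space, labeled)
print(filled)
```

Only two labeled items are used here, yet the attribute is filled for all eight, which mirrors the cost argument in the abstract: the crowd supplies a small training set and the perceptual space supplies the rest.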
Similar resources
A Data-driven Method for Crowd Simulation using a Holonification Model
In this paper, we present a data-driven method for crowd simulation with a holonification model. With this extra module, the accuracy of the simulation increases and agents exhibit more realistic behaviors. First, we show how to use the concept of a holon in crowd simulation and how effective it is. For this reason, we use simple rules for holonification. Using real-world data, we model the...
Efficient Recursive XML Query Processing in Relational Database Systems
Recursive queries are quite important in the context of XML databases. In addition, several recent papers have investigated a relational approach to store XML data and there is growing evidence that schema-conscious approaches are a better option than schema-oblivious techniques as far as query performance is concerned. However, the issue of recursive XML queries for such approaches has not bee...
Optimization techniques for human computation-enabled data processing systems
Crowdsourced labor markets make it possible to recruit large numbers of people to complete small tasks that are difficult to automate on computers. These marketplaces are increasingly widely used, with projections of over $1 billion being transferred between crowd employers and crowd workers by the end of 2012. While crowdsourcing enables forms of computation that artificial intelligence has no...
NoSym: Non-Symbolic Databases for Data Decoupling
Under the Unique Name Assumption (UNA), users need to have shared agreements on signifiers to use in schema or data, e.g. to use “genre” and not “type” to refer to a movie’s category. Agreements are difficult in open environments such as datasets on the web, open data, and crowd-sourced databases, thus this assumption can be invalid. Schema matching and data integration can be limited in respon...
Query expansion based on relevance feedback and latent semantic analysis
Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users. Constructing an adequate query that best specifies the user's information need to the search engine is an important concern for web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...
Journal:
- PVLDB
Volume 5, Issue
Pages -
Publication date 2012